Introduction

Currently, many advanced analytics metrics focus on shot quality, defending shots and goals, and play at even strength. During the Tracking Data Panel at the 2022 Ottawa Hockey Analytics Conference (OTTHAC), measuring defensive performance was posed as the area of tracking analytics with the biggest opportunity and the greatest difficulty. Shawn Ferris echoed this sentiment in a 2020 Hockey-Graphs article, writing that there is still much more to learn about shorthanded defense with respect to understanding optimal in-zone formations and evaluating individual players. This paper models successful defensive plays as a way to scout and identify defensive performance during penalty kills.

A successful defensive play is defined based on the outcome of each event, as described below.

Successful Defensive Plays

Current event metrics assign defensive credit for takeaways, puck recoveries, and blocked shots, but these events do not cover the majority of plays, such as passes, zone entries, and dump in/out plays. To address this, we rely on tracking (movement) data to identify the closest opposing player to each offensive player as a way to credit a defender for the outcome of a play. Our assumption is that the closest defender, or the one most prominent in the frame, is the one most likely to be contributing to the outcome of the play. Each offensive player is assigned a closest defensive player, so a single defender can be guarding multiple offensive players for a given event. We did not identify the opposing player closest to the location of the play itself, again under the assumption that the defender guarding the offensive player is the one most likely to contribute to the outcome. By merging the event data with the approximate time frame of the tracking data, we gain insight into the average location and separation at the start of, leading up to, during, and after the event. These separation metrics also allow us to identify the closest defenders.
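The closest-defender assignment can be sketched as a nearest-neighbor lookup per tracking frame. This is a minimal illustration; the player names and frame layout are hypothetical, and the actual Stathletes tracking schema is not shown here.

```python
# Sketch of the closest-defender assignment described above.
import math

def assign_closest_defenders(offense, defense):
    """For each offensive player, find the nearest opposing player.

    offense / defense: dicts mapping player id -> (x, y) rink position
    for a single tracking frame. Returns {offender_id: defender_id}.
    Note a defender may be assigned to multiple offensive players.
    """
    assignments = {}
    for o_id, (ox, oy) in offense.items():
        assignments[o_id] = min(
            defense,
            key=lambda d_id: math.hypot(defense[d_id][0] - ox,
                                        defense[d_id][1] - oy),
        )
    return assignments

# Hypothetical frame: two attackers, two defenders
offense = {"O1": (60.0, 10.0), "O2": (70.0, -5.0)}
defense = {"D1": (62.0, 11.0), "D2": (80.0, -5.0)}
print(assign_closest_defenders(offense, defense))
# {'O1': 'D1', 'O2': 'D2'}
```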

Modelling Approach

Using an approach similar to Jonathan Judge's at Baseball Prospectus, we built a generalized linear mixed-effects model fit via Markov chain Monte Carlo (MCMC) to identify which defensive player (based on the closest-defender assignment to each offensive player) had the greatest effect on the outcome of a defensive play. Our MCMC sampler uses 5 chains and 10,000 iterations per chain to achieve a sufficient effective sample size. In the mixed-effects model, we fit the closest defensive player as a random effect, modeling the assumption that a defensive player can alter the result of a play, especially in penalty-killing situations where a defensive play matters even more to the end result of the game. Through our exploratory analysis, the following fixed effects were identified as determining a successful defensive play:
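The model structure described above can be written as a logistic regression with a random intercept per closest defender. The notation below is ours, not from the original paper:

```latex
\operatorname{logit}\,\Pr(\text{defended}_i = 1)
  = \beta_0 + \mathbf{x}_i^{\top}\boldsymbol{\beta} + b_{d(i)},
\qquad b_d \sim \mathcal{N}\!\left(0, \tau_{00}^2\right)
```

where \(\mathbf{x}_i\) holds the centered and scaled separation/distance covariates for event \(i\), and \(b_{d(i)}\) is the random intercept of the closest defender on that event.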

All separation and distance variables were centered and scaled to ensure a consistent relative effect across variables in the model.
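As a concrete illustration of the centering-and-scaling step, here is a minimal sketch using the sample standard deviation; the paper does not specify the exact scaling routine used:

```python
def center_and_scale(values):
    """Standardize a list of distances: subtract the mean, then
    divide by the sample standard deviation (ddof = 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / (n - 1)) ** 0.5
    return [(v - mean) / sd for v in values]

# Hypothetical separation distances in meters
print(center_and_scale([10.0, 14.0, 18.0]))  # [-1.0, 0.0, 1.0]
```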

Model Results & Evaluation

The model results provide evidence (τ00 = 0.78) of wide variability in a defender's ability to alter a play's outcome. This gives us a basis to believe there are specialists in the tournament who defend well in power-play scenarios, and the framework can help identify those "special teams" players who should be on the ice to defend against power plays. Outside of an individual player's ability to defend, an offensive player's spacing and separation from each player tends to determine whether a play is successful. A defender's ability to close in on the shooter or passer has the strongest association with causing an unsuccessful offensive play. One caveat is the fairly high Monte Carlo standard error of the variables in the model (typically we want roughly less than 5%), which is likely driven by the different event types included in the data.

Defended Play Success Model Summary
Method: Bayesian MCMC mixed-effects GLM | Chains: 5 | Iterations per chain: 10,000

| Coefficient | Estimate | Std. Error | CI Low (5%) | CI High (95%) |
|---|---|---|---|---|
| Intercept | -0.51 | 0.099 | -0.70 | -0.31 |
| Separation | -0.18 | 0.048 | -0.28 | -0.09 |
| Net Separation | -0.08 | 0.041 | -0.16 | 0.00 |
| Closest Opposition Player Separation | 0.08 | 0.044 | -0.01 | 0.16 |
| Closest Teammate Player Separation | -0.08 | 0.041 | -0.16 | 0.00 |
| Absolute x-Distance to Passer/Shooter | -0.14 | 0.043 | -0.23 | -0.06 |
| Absolute x-Distance to Event Recipient | 0.11 | 0.048 | 0.01 | 0.20 |
| τ00 (defender_id) | 0.78 | 0.190 | 0.47 | 1.18 |

Data for the model were centered and scaled.

The following charts help evaluate how well our model fits. The first check (left) is a posterior predictive check: it compares Yrep (expected defended plays in the replicated simulations) to Y (actual defended-play results). The model captures the binomial distribution of the outcome (successfully defended play or not) fairly well, which leads us to believe the modeling results are sufficient.

The second check evaluates whether the chains have converged to the target distribution (a diagnostic specific to MCMC estimation). The second chart (right) shows that the model's variables converged properly. We used Gelman and Rubin's potential scale reduction factor, or Rhat, which compares between-chain and within-chain variance; typically, you want a value near 1 and less than 1.05. The distribution of all variables and intercepts (defensive players) below shows proper convergence within the model. Given the evidence of meaningful variance among defenders and the proper convergence of the variables, we conclude that this Bayesian model is useful for evaluating defensive play during power plays.
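For readers who want the diagnostic itself, here is a minimal sketch of Gelman and Rubin's potential scale reduction factor for a single parameter. This is the basic (non-split) variant; production code would use a library implementation such as ArviZ's rhat:

```python
def gelman_rubin_rhat(chains):
    """Gelman-Rubin potential scale reduction factor for one parameter.

    chains: list of equal-length lists of posterior draws, one per chain.
    Compares between-chain variance (B) to within-chain variance (W);
    values near 1 (typically < 1.05) indicate convergence.
    """
    m = len(chains)          # number of chains
    n = len(chains[0])       # draws per chain
    chain_means = [sum(c) / n for c in chains]
    grand_mean = sum(chain_means) / m
    # Between-chain variance
    B = n / (m - 1) * sum((cm - grand_mean) ** 2 for cm in chain_means)
    # Mean within-chain variance
    W = sum(
        sum((x - cm) ** 2 for x in c) / (n - 1)
        for c, cm in zip(chains, chain_means)
    ) / m
    # Pooled posterior variance estimate
    var_plus = (n - 1) / n * W + B / n
    return (var_plus / W) ** 0.5

# Two well-mixed chains give Rhat near 1; divergent chains give Rhat >> 1
print(gelman_rubin_rhat([[0.1, 0.2, 0.3, 0.4], [0.15, 0.25, 0.35, 0.45]]))
```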

To determine player performance relative to average, we created a "without defender" metric, which gives the probabilistic outcome of the play without accounting for a specific defender's involvement. The average difference between the MCMC mean and the "without defender" metric yields our Defending Plays Above Average (DPAA) metric. DPAA shows the lift in probability a defensive player has on the outcome of a play. Any play with a DPAA greater than 0 is defined as a positively defended play.
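The DPAA construction can be sketched as the difference between two predicted probabilities from the logistic model. The intercept below comes from the model summary table; the defender effect of 0.9 is a hypothetical random-intercept draw, used only for illustration:

```python
import math

def inv_logit(z):
    """Map a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def dpaa(linear_predictor, defender_effect):
    """Lift in defended-play probability attributable to the defender:
    P(defended | defender's random intercept) minus the
    "without defender" baseline P(defended | effect set to 0)."""
    with_def = inv_logit(linear_predictor + defender_effect)
    without_def = inv_logit(linear_predictor)
    return with_def - without_def

# A defender with a positive random intercept (here 0.9, hypothetical)
# lifts the probability of a successfully defended play:
print(round(dpaa(-0.51, 0.9), 3))
```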

The model identifies Dump In/Outs as the event with the highest probability of being positively defended (e.g., a Dump In/Out resulting in a turnover). Takeaways and Puck Recoveries tend to be less positively defended, which may be due to a secondary defender's ability to disrupt a play. Passes are among the most difficult plays to defend, given a pass is more likely to be completed successfully than disrupted by a defender. We can therefore also use a passing-specific DPAA to identify the defenders who were especially effective at disrupting offensive movement during a power play. As Shawn Ferris noted, "We know that deploying better offensive players, controlling more entries, and passing after those entries all lead to more shorthanded goals over time." Players with a reputation for defending the blue line and disrupting passing are more likely to be on the ice during shorthanded defensive play.

DPAA Average by Event
DPAA: Defended Plays Above Average | Positively defended plays have DPAA > 0

| Event | Event Count | DPAA | Positively Defended Plays | Positively Defended Plays % |
|---|---|---|---|---|
| Dump In/Out | 46 | 0.161 | 41 | 89% |
| Puck Recovery | 138 | 0.047 | 73 | 53% |
| Zone Entry | 26 | 0.022 | 18 | 69% |
| Shot | 29 | 0.014 | 19 | 66% |
| Takeaway | 10 | 0.004 | 5 | 50% |
| Play | 258 | -0.033 | 158 | 61% |

"Play" is a pass.

Tournament Performance Analysis

Using our model, we can evaluate team performance during power plays and identify top and bottom individual performers, giving teams a way to build a "penalty kill" unit.

Team Performance

USA, Finland, and the Russian Olympic Committee (ROC) each had an average DPAA rating greater than 0, largely due to their ability to defend passes. USA is an interesting case study: they faced relatively few passes but saw the highest number of Dump In/Outs (18) and Puck Recoveries (36) of any country. In fact, USA had a positive average DPAA on Puck Recoveries, whereas Switzerland's was negative. As noted earlier, the ability to defend puck recoveries reflects USA's ability to preemptively disrupt passes and shots. ROC defended a high number of passing plays but had the lowest penalty-kill rate overall and the most power-play goals against in the tournament (8).

Canada and Switzerland were the lowest-performing teams in defending plays. Canada conceded 50% of their goals against during power plays despite having the highest overall goal differential in the Olympics, due to their inability to defend passes in penalty-kill situations. For example, USA outshot Canada 53-27 in their first game, yet Canada won (which can be attributed to excellent Canadian goaltending). Overall, Canada defended plays with slightly higher quality (DPAA) than Switzerland. Switzerland had the lowest average DPAA because of their inability to defend shots: they had the worst shot DPAA and the second-lowest penalty-kill rate in the tournament.

To-do: Add Separation

Defended Above Average by Team Play
Data: Stathletes | Int'l Games: 2022 Winter Olympics

| Defender Team | DPAA | Power Plays | Defending Plays | Positive Defended Plays % | Passes | Positive Defended Passing Plays % |
|---|---|---|---|---|---|---|
| USA | 0.170 | 11 | 89 | 84% | 26 | 81% |
| Finland | 0.094 | 18 | 119 | 86% | 63 | 87% |
| ROC | 0.021 | 8 | 81 | 89% | 49 | 88% |
| Canada | -0.086 | 10 | 110 | 22% | 65 | 22% |
| Switzerland | -0.099 | 16 | 108 | 32% | 55 | 44% |

Given that most of the differences in DPAA come from how well a team defends passing, the visual below identifies each team's approach to defending passes in penalty-killing situations. USA tends to defend the blue line very well, resulting in fewer passes in the offensive zone (with the exception of the top-left side). By contrast, most of Finland's strong defensive play came in their own zone, while they tended not to perform well in the neutral zone. ROC is an interesting case study: they deploy a diamond/box defensive formation, and their defensive play appears inconsistent, as passes through the formation have not been defended well. In scouting Canada, they tend to allow many passes into the offensive zone, forcing above-average defending in front of the net. Canada needs to replace the top player in their diamond/box formation or get more aggressive defending the blue line.

Individual Performance Analysis

Top Performers

To ensure proper scouting of individual performers, simulating the posteriors (10,000 iterations) helps verify the stability of the MCMC metrics. Players from Finland and USA were the top performers, even after simulating the posteriors. One reason is that these players tend to stay closer to the player they are guarding than the average defender (15 m), with the exception of Megan Keller. Keller, however, was the best among these defenders at defending passes (minimum 5 passes), which makes her a valuable specialist for the USA team. Although Viivi Vainikka had the highest DPAA, most of her plays came from successfully defending Dump In/Outs (easier plays to defend against). Ronja Savolainen guarded 43 offensive players and posted an impressive passing DPAA of 0.263. When simulating the posteriors, she also had the lowest standard deviation relative to her simulated DPAA mean, which speaks to her consistent play for Finland and is likely a big reason for its positive defense. The top 3 defenders by DPAA (MCMC) for each country (7+ plays):

Canada

  • Marie-Philip Poulin (Center): 0.175
  • Blayre Turnbull (Center): 0.131
  • Brianne Jenner (Center): -0.009

Finland

  • Viivi Vainikka (Center): 0.488
  • Minnamari Tuominen (Defense): 0.472
  • Ronja Savolainen (Defense): 0.264

ROC

  • Oxana Bratishcheva (Center): 0.173
  • Nina Pirogova (Center): 0.108
  • Olga Sosina (Left Wing): 0.068

Switzerland

  • Phoebe Staenz (Center): 0.075
  • Noemi Ryhner (Center): 0.023
  • Lara Stalder (Center): -0.031

USA

  • Megan Keller (Defense): 0.477
  • Abby Roque (Center): 0.375
  • Amanda Kessel (Center): 0.203
Top Defending Plays Above Average
Data: Stathletes | Int'l Games: 2022 Winter Olympics | Min. 7 plays defended; DPAA > 0

| Player | Pos. | Plays¹ | DPAA² | DPAA Sdev³ | DPAA Sim⁴ | DPAA Sim Sdev⁵ | Off. Players Guarded | Avg. Defender Separation (m) | Pass Plays | Pass DPAA |
|---|---|---|---|---|---|---|---|---|---|---|
| Viivi Vainikka | Center | 8 | 0.488 | 0.134 | 0.476 | 0.568 | 14 | 10.06 | 3 | 0.493 |
| Megan Keller | Defense | 14 | 0.477 | 0.109 | 0.472 | 0.526 | 34 | 16.71 | 5 | 0.504 |
| Minnamari Tuominen | Defense | 7 | 0.472 | 0.137 | 0.462 | 0.548 | 15 | 12.37 | 3 | 0.495 |
| Abby Roque | Center | 7 | 0.375 | 0.123 | 0.367 | 0.454 | 19 | 12.54 | 0 | No Plays |
| Ronja Savolainen | Defense | 22 | 0.264 | 0.077 | 0.262 | 0.340 | 43 | 13.76 | 15 | 0.263 |

¹ Plays include passes, takeaways, dump in/outs, zone entries, shots, and puck recoveries (avg. 8 plays per defender in the overall dataset)
² DPAA uses 5 MCMC chains
³ Standard deviation of DPAA
⁴ Mean of the posterior simulations (10,000 simulations)
⁵ Standard deviation of the posterior simulations (10,000 simulations)
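The posterior-simulation columns above can be sketched as follows, assuming a defender's random-effect posterior is summarized as a normal distribution. The effect mean and sd used in the example are hypothetical, not values from the paper:

```python
import math
import random

def simulate_dpaa(baseline_lp, effect_mean, effect_sd,
                  n_sims=10_000, seed=1):
    """Draw n_sims values of a defender's random intercept from an
    assumed Normal(effect_mean, effect_sd) posterior, compute DPAA for
    each draw, and return the (mean, sd) of the simulated DPAA values."""
    rng = random.Random(seed)
    inv_logit = lambda z: 1.0 / (1.0 + math.exp(-z))
    baseline = inv_logit(baseline_lp)  # "without defender" probability
    draws = [
        inv_logit(baseline_lp + rng.gauss(effect_mean, effect_sd)) - baseline
        for _ in range(n_sims)
    ]
    mean = sum(draws) / n_sims
    sd = (sum((d - mean) ** 2 for d in draws) / (n_sims - 1)) ** 0.5
    return mean, sd

# Hypothetical defender: posterior effect ~ Normal(0.9, 0.45)
mean, sd = simulate_dpaa(-0.51, 0.9, 0.45)
print(round(mean, 3), round(sd, 3))
```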

Bottom Performers

The bottom individual performers were primarily from Canada and Switzerland, with the exception of Susanna Tapani. Higher-than-average separation from the players she was guarding and the total number of players guarded (40, the 8th highest) appear to be the top reasons for Tapani's performance. This is interesting given that Ronja Savolainen had the opposite result while guarding a similar number of players. Jocelyne Larocque faced the most defensive plays of anyone yet still defended at a higher rate (-0.029) than several of her Canadian teammates, due to her previously mentioned play close to the net. Another observation is that ROC had only one player with a negative DPAA despite overall inconsistent play; their defensive formation is the likely reason for their low penalty-kill rate.

Canada

  • Rebecca Johnston (Center): -0.219
  • Natalie Spooner (Center): -0.187
  • Mich Zandee-Hart (Defense): -0.148

Finland

  • Susanna Tapani (Center): -0.227
  • Sini Karjalainen (Defense): -0.199
  • Ella Viitasuo (Defense): -0.040

ROC

  • Fanuza Kadirova (Left Wing): -0.088
  • Maria Batalova (Center): 0.014
  • Angelina Goncharenko (Defense): 0.017

Switzerland

  • Lara Christen (Defense): -0.226
  • Nicole Vallario (Defense): -0.225
  • Alina Marti (Center): -0.218

USA

  • Alex Carpenter (Left Wing): -0.143
  • Lee Stecklein (Defense): -0.060
  • Dani Cameranesi (Center): 0.003

Passing plays are not listed separately below, given that a majority or all of these players' plays were passes.

To-do: Add more observations

Bottom Defending Plays Above Average
Data: Stathletes | Int'l Games: 2022 Winter Olympics | Min. 7 plays defended; DPAA < 0

| Player | Pos. | Plays¹ | DPAA² | DPAA Sdev³ | DPAA Sim⁴ | DPAA Sim Sdev | Off. Players Guarded | Avg. Defender Separation (m) |
|---|---|---|---|---|---|---|---|---|
| Susanna Tapani | Center | 14 | -0.227 | 0.125 | -0.222 | -0.180 | 40 | 19.42 |
| Lara Christen | Defense | 14 | -0.226 | 0.143 | -0.219 | -0.154 | 25 | 14.18 |
| Nicole Vallario | Defense | 8 | -0.225 | 0.166 | -0.216 | -0.108 | 13 | 8.86 |
| Rebecca Johnston | Center | 9 | -0.219 | 0.194 | -0.209 | -0.132 | 15 | 15.93 |
| Alina Marti | Center | 11 | -0.218 | 0.186 | -0.210 | -0.145 | 19 | 19.08 |

¹ Plays include passes, takeaways, dump in/outs, zone entries, shots, and puck recoveries (avg. 8 plays per defender)
² DPAA uses 5 MCMC chains
³ Standard deviation of DPAA
⁴ Mean of the posterior simulations (10,000 simulations)

High Volume Performers

Below shows a view of the

Conclusion

— List key action points and conclusion — List application to other aspects

References

https://www.baseballprospectus.com/leaderboards/catching/

https://www.baseballprospectus.com/news/article/38289/bayesian-bagging-generate-uncertainty-intervals-catcher-framing-story/

https://www.baseballprospectus.com/news/article/25514/moving-beyond-wowy-a-mixed-approach-to-measuring-catcher-framing/

http://mjskay.github.io/tidybayes/articles/tidy-rstanarm.html

https://www.iihf.com/en/events/2022/olympic-w/schedule

https://github.com/mtthwastn/statswithmatt/blob/master/hockey-with-r/gg-rink.R

https://hockey-graphs.com/2020/04/16/using-data-to-inform-shorthanded-neutral-zone-decisions/#more-24218

https://www.statsportsconsulting.com/wp-content/uploads/Deffensive-Efficiency-Metrics-DEMs.pdf

To-do: Find more defensive papers